Multi-keyword spotting of telephone speech using a fuzzy search algorithm and keyword-driven two-level CBSM

Authors

  • Chung-Hsien Wu
  • Yeou-Jiunn Chen
Abstract

In telephone speech recognition, the acoustic mismatch between training and testing environments often causes a severe degradation in the recognition performance. This paper presents a keyword-driven two-level codebook-based stochastic matching (CBSM) algorithm to eliminate the acoustic mismatch. Additionally, in Mandarin speech, it is difficult to correctly recognize the unvoiced part in a syllable. In order to reduce the recognition error of unvoiced segments, a fuzzy search algorithm is proposed to extract keyword candidates from a syllable lattice. Finally, a keyword relation and a weighting function for keyword combinations are presented for multi-keyword spotting. In the multi-keyword spotting of Mandarin speech, 94 right context-dependent and 38 context-independent subsyllables are used as the basic recognition units. A corresponding anti-subsyllable model for each subsyllable is trained and used for verification. In this system, 2583 faculty names and 39 department names are selected as the primary keywords and the secondary keywords, respectively. Using a testing set of 3088 conversational speech utterances from 33 speakers (20 male, 13 female), these techniques reduced the recognition error rate from 29.6% to 20.6% for multi-keywords embedded in non-keyword speech. © 2001 Elsevier Science B.V. All rights reserved.
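The abstract does not spell out how the fuzzy search works, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each syllable is split into an initial (the mostly unvoiced part) and a final, that the lattice is a list of candidate syllables per position, and that a mismatched initial is tolerated at a fixed penalty. The names `syllable_score`, `fuzzy_spot`, `initial_penalty` and `threshold` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a syllable is (initial, final); the initial is the mostly
# unvoiced consonant part that is hard to recognize reliably.
Syllable = Tuple[str, str]


def syllable_score(hyp: Syllable, ref: Syllable,
                   initial_penalty: float = 0.5) -> float:
    """Fuzzy match score in [0, 1]: finals must agree, initials may differ."""
    if hyp[1] != ref[1]:
        return 0.0
    return 1.0 if hyp[0] == ref[0] else 1.0 - initial_penalty


def fuzzy_spot(lattice: List[List[Syllable]],
               keywords: Dict[str, List[Syllable]],
               threshold: float = 0.7) -> List[Tuple[str, int, float]]:
    """Return (keyword, start position, score) candidates, best first."""
    hits = []
    for name, syllables in keywords.items():
        n = len(syllables)
        for start in range(len(lattice) - n + 1):
            total = 0.0
            for offset, ref in enumerate(syllables):
                # Best-scoring lattice candidate at this position.
                best = max((syllable_score(h, ref)
                            for h in lattice[start + offset]), default=0.0)
                if best == 0.0:
                    break  # no candidate shares the final: reject this start
                total += best
            else:
                score = total / n  # average per-syllable score
                if score >= threshold:
                    hits.append((name, start, score))
    return sorted(hits, key=lambda h: -h[2])
```

In this sketch a keyword whose last syllable was recognized with the wrong initial still surfaces as a candidate with a reduced score, which is the kind of tolerance to unvoiced-segment errors the abstract describes. A real system would derive the penalties from acoustic or confusion statistics rather than a constant.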

Related articles

Robust Multi-Keyword Spotting of Telephone Speech Using Stochastic Matching

In telephone speech recognition, the acoustic mismatch between the training and the test environment often causes severe degradation due to the channel distortion and ambient noise. In this paper, a two-level codebook-based stochastic matching (CBSM) is proposed to deal with the acoustic mismatch. For multi-keyword detection, we define a keyword relation table and a weighting function for reaso...

Telephone speech multi-keyword spotting using fuzzy search algorithm and prosodic verification

In this paper, a fuzzy search algorithm is proposed to deal with recognition errors in telephone speech. Since prosodic information is a distinctive and important feature of Mandarin speech, we integrate the prosodic information into keyword verification. For multi-keyword detection, we define a keyword relation and a weighting function for reasonable keyword combinations. In the keywo...

A new keyword spotting algorithm with pre-calculated optimal thresholds

Keyword spotting is a very forward-looking and promising branch of speech recognition. This paper presents an HMM-based keyword spotting system, which works with a new algorithm. The first discussion topic is the description of the search algorithm, which needs no representation of the non-keyword parts of the speech signal. For this purpose, the computation of the HMM scores and the Viterbi algo...

Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback

Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach to document image retrieval based on keyword spotting is proposed. The proposed method presents a framework using relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...

A fast fuzzy keyword spotting algorithm based on syllable confusion network

This paper presents a fast fuzzy search algorithm to extract keyword candidates from syllable confusion networks (SCNs) in Mandarin spontaneous speech. Since the recognition accuracy of spontaneous speech is quite poor, a syllable confusion matrix (SCM) is applied to compensate for the recognition errors and to improve recall. For fast retrieval, an efficient vocabulary-independent index structur...

Journal:
  • Speech Communication

Volume 33  Issue 

Pages  -

Publication date: 2001